(57) ABSTRACT

A method of mid-level pattern recognition provides for a 
pose invariant Hough Transform by parametrizing pairs of 
points in a pattern with respect to at least two reference 
points, thereby providing a parameter table that is scale- or 
rotation-invariant. A corresponding inverse transform may 
be applied to test hypothesized matches in an image and a 
distance transform utilized to quantify the level of match. 

U.S. Patent US 7,239,751 B1, Jul. 3, 2007

[Drawing sheets omitted: FIG. 1B; R-Table diagram; FIG. 4; FIGS. 6A and 6B; FIGS. 8A and 8B]

HYPOTHESIS SUPPORT MECHANISM FOR
MID-LEVEL VISUAL PATTERN 
RECOGNITION 

ORIGIN OF THE INVENTION

The invention described herein was made by employees 
of the United States Government and may be manufactured 
and used by or for the Government of the United States of 
America for governmental purposes without the payment of
any royalties thereon or therefor. 

FIELD OF THE INVENTION 

The present invention relates to computer vision systems, 
and more particularly to the application of a novel version of 
the Generalized Hough Transform to reduce the computa- 
tional complexity of matching an object depicted in an 
image. 

BACKGROUND OF THE INVENTION

The ultimate goal of computer vision is image under- 
standing, in other words, knowing what is within an image 
at every coordinate. A complete computer vision system 
should be able to segment an image into homogeneous
portions, extract regions from the segments that are single 
objects, and finally output a response as to the locations of 
these objects and what they are. 

Frameworks for image understanding consist of three, not
necessarily separate, processes. Consider a representative
computer vision system as shown in FIG. 1A. In the first
process 11, image segmentation is performed; this consists
of dividing the image into homogeneous portions that are
similar based on a correlation criterion. Much of the work in
computer vision has focused on this area, with edge
detection, region growing (clustering), and thresholding as
the primary methods. Image segmentation is referred to as
the low-level vision (LLV) process. In the second process 12,
region extraction is performed. Region extraction receives
as input the results obtained during the LLV stage. With
this information, region extraction, or the intermediate-level
vision (ILV) process, attempts to represent the segments as
single, hypothesized objects. This requires the ILV process
to search for evidence of the desired region using the output
of the LLV process. Consequently, the third process 13
performs image understanding based on the extracted regions
provided as input. The hypothesized-image understanding
operation is referred to as the high-level vision (HLV)
process.

Most computer vision research over the past 30 years has
focused on LLV processes. Efforts to further the knowledge
of ILV processes have primarily utilized LLV methods.
Therefore, it remains an objective of computer vision systems
to locate low-level image regions whose features best
support alternative image hypotheses, developed by a
high-level vision process, and provide the levels- and
indicators-of-match.

There have been many attempts to solve the ILV problem
utilizing a Hough Transform methodology. The Hough
Transform is a particularly desirable technique for use in
vision systems when the patterns in the image are sparsely
digitized, for instance having gaps in the patterns or
containing extraneous “noise.” Such gaps and noise are
common in the data provided by LLV processes such as edge
detection utilized on digitally captured images. The Hough
Transform was originally described in U.S. Pat. No. 3,069,654.

In an influential paper, Generalizing the Hough Transform to
Detect Arbitrary Shapes (1981), D. H. Ballard generalized the
Hough Transform to arbitrary shapes and coined the term
Generalized Hough Transform (GHT). The Generalized Hough
Transform is a method for locating instances of a known
pattern in an image. The search pattern is parameterized as a
set of vectors from feature points in the pattern to a fixed
reference point. This set of vectors is the R-table. The
feature points are usually edge features and the reference
point is often at or near the centroid of the search pattern.
The image space, typically Cartesian, is mapped into parameter
or Hough space. To locate the pattern in an image, the set of
feature points in the image is considered. Each image feature
is considered to be each of the pattern features in turn and
the corresponding locations of the reference point are
calculated. An accumulator array keeps track of the frequency
with which each possible reference point location is
encountered. After all the image features have been processed,
the accumulator array will contain high values (peaks) for
locations where many image features coincided with many
pattern features. High peaks (relative to the number of
features in the pattern) correspond to reference point
locations where instances of the pattern occur in the image.
The Hough Transform can be enhanced by considering rotated
and shortened or lengthened versions of the vectors to locate
instances of the pattern at different orientations and scales.
In this case, a four-dimensional accumulator array is
required and the computation is increased by two orders of
magnitude. The key contribution of the GHT is the use of
gradient vector data to reduce the computational complexity
of detecting arbitrary shapes. Unfortunately, the method’s
time and space complexity becomes very high because it
requires a search of the entire four-dimensional Hough
parameter space. For rotation- and scale-invariance, the GHT
method requires a priori knowledge of the possible rotations
and scales that may be encountered. More recent procedures
that provide either or both of rotation and scale invariance
using the Hough Transform include:

Jeng and Tsai, Fast Generalized Hough Transform (1990),
propose a new approach to the GHT in which transformations
are applied to the template in order to obtain rotation- and
scale-invariance. The R-Table
is defined as in the original GHT technique. Scale-invariance 
is provided by incrementing all the array positions of a 
Hough parameter space using another table called the SI- 
PSF. For rotation-invariance each position in the SI-PSF 
with a non-zero value generates a circle with its center at the 
reference point; a radius equal to the distance between the 
reference point and this position of the SI-PSF is calculated. 
Subsequently, these circles are correspondingly superim- 
posed onto each image point in order. Obviously, each image 
point requires a high number of increments; the computa- 
tional complexity of this method is very high if the template 
and the image shapes have a large number of points. 

Thomas followed, in Compressing the Parameter Space of the
Generalized Hough Transform (1992), by trying to compress the
Hough parameter space by one degree of freedom to obtain the
location of arbitrary shapes at any rotation. This method
considers a set of displacement vectors, {r}, such that each
edge pixel with identical gradient angles increments positions
in one plane of the parameter space. Thus, the original
four-dimensional Hough parameter space of the GHT reduces to
three dimensions. As a result, the technique is not
scale-invariant and requires the same processing complexity
as the GHT.

Pao, et al., Shape Recognition Using the Straight Line
Hough Transform (1992), described a technique derived
from the straight-line Hough Transform. A displacement-
invariant signature, called the STIRS, is obtained by
subtracting points in the same column of the STIRS space.
Subsequently, template and image signatures are compared
using a correlation operator to find rotations. Unfortunately,
scale-invariance is not provided, since the scale must be
known a priori. Experiments show that the STIRS does not work
well when different shapes appear in the image.

A new version of the GHT called the Linear GHT
(LIGHT) was developed by Yao and Tong, Linear Gener- 
alized Hough Transform and its Parallelization (1993). A 
linear numeric pattern was devised, denoted the vertical 
pattern, which constitutes the length of the object along the 
direction of a reference axis (usually the y-axis). The authors 
state that rotation- and scale-invariance is handled, using 
this new method, in much the same way it is performed by 
the GHT. Clearly, the same deficiencies exist for this method 
as it requires a large Hough parameter space and the a priori 
knowledge of the expected rotations and scales. 

The effort of Ser and Sui, A New Generalized Hough 
Transform for the Detection of Irregular Objects (1995) 
describes an approach that merges the advantages of the 
Hough Transform and that of a technique called contour 
sequencing. The calculation of the contour sequence 
requires that an entire object’s perimeter be available and 
not occluded. Thus, if a portion of the desired object is
occluded, for instance in a noisy image, this method will fail.

Aguado, et al., Arbitrary Shape Hough Transform by 
Invariant Geometric Features (1997) approached the prob- 
lem of region extraction by using the Hough Transform 
under general transformations. Even though this method
provides for rotation- and scale-invariance, it comes at a
complexity cost for derivations of shape-specific general
transformations, which are also required for
translation-invariance.

The most recent work by Guil, et al., A Fast Hough Transform
for Segment Detection (1995), presents an algorithm based on
the GHT which calculates the rotation, scale, and translation
of an object with respect to a template. The methodology
consists of a three-stage detection process and the creation
of five new tables. Three of the tables are constructed for
the template; the remaining two are used against the image.
The first stage of the detection process obtains the rotation,
the next gathers the scale, and finally the translation is
found in the third.

The complexity of this method is clearly high as the image 
and template are repeatedly tested using different tables to 
obtain the invariant values. Furthermore, the results of a 
previous stage are used to obtain the answer to the next 
stage, hence, if a previous stage fails the next one will also. 
The use of gradient angles is appropriate; however, dividing
the original R-Table into five tables to obtain the desired
invariances has added unnecessary complexity to the
problem.

SUMMARY OF THE INVENTION 

It is therefore an object of the invention to provide an ILV 
process utilizing a rotation-, scale- and translation-invariant 
Generalized Hough Transform, and furthermore to reduce 
the computational complexity inherent in the GHT. 

It is another object of the invention to provide an ILV 
process that may be applied to different types of LLV results 
in computer vision applications.

It is yet another object of the invention to provide a
transform inversion that may be used with the ILV process
to verify proposed matches with a search pattern.

The technique of the invention is based on the GHT, and
solves the problems of rotation- and scale-invariance that
plague the generalized Hough Transform while providing a
technique that solves the ILV problem.

The Hough Transform is utilized in hypothesis testing.
When testing the hypothesis of an object in an image, a
considerable number of sub-hypotheses are generated in
order to extract the correct region in the image. This makes
the problem of region extraction one which is known as
combinatorial optimization. Many combinatorial optimization
techniques have been used to solve the region extraction
problem, among them genetic algorithms, simulated annealing,
and tabu search. None of these methods guarantees finding
the correct or exact solution, only the best possible one (after
a number of iterations of the given algorithm).

Unlike the methods listed above, the novel version of the
generalized Hough Transform employed in the invention,
called the Pose-Invariant Hough Transform (PIHT), does
not require the generation of numerous sub-hypotheses to
locate the desired region for extraction. Instead, a new
version of the R-Table, called the J-Table, is provided with
rotation- and scale-invariance built into the table (i.e.,
hypothesis). The novel PIHT method with its new J-Table is
invariant to rotation or scale differences of the desired object
in the image. This eliminates the need to generate
sub-hypotheses and the complexities associated with doing
so. Furthermore, the invention can use the results of the PIHT
and perform the region extraction or indicator-of-match
required by the ILV process. Hence, an entirely new technique
is developed, called the Inverse-Pose-Invariant Hough
Transform (IPIHT), which executes the indicator-of-match.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention may be more fully understood from the
following detailed description, in conjunction with the
accompanying figures, wherein:

FIG. 1A is a schematic chart of the position of the
Intermediate Level Vision Recognition step in a computer
vision system.

FIG. 1B is a schematic chart depicting application of the
present invention to facilitate Intermediate Level Vision
Recognition in a computer vision system.

FIG. 2 is a diagram of the application of the GHT to create
a hypothetical R-table for an object.

FIG. 3 is a diagram of the application of the Pose-Invariant
Hough Transform utilizing two reference points.

FIG. 4 is a diagram of matching pairs of gradient angles
applied to the image of FIG. 3.

FIG. 5 is a diagram of the geometry of the
Inverse-Pose-Invariant Hough Transform applied to the image
of FIG. 3.

FIG. 6A illustrates a collection of images utilized in
testing the invention with respect to an arbitrarily shaped
object pattern.

FIG. 6B illustrates the hypothesis for the arbitrarily
shaped object pattern.

FIG. 7 illustrates a surface plot of the Hough parameter
space for the arbitrarily shaped object pattern of FIG. 6.

FIG. 8A is a flow chart of the Pose-Invariant Hough
Transform of the present invention.

FIG. 8B is a flow chart of the Inverse-Pose-Invariant
Hough Transform of the present invention.

FIG. 8C is a flow chart of the present invention including
the Distance Transform and Matching Metric steps.

DETAILED DESCRIPTION OF THE
INVENTION

The invention is addressed to the Intermediate Level
Vision problem of detecting objects matched to a template or
pattern. As shown in FIG. 2, the usual GHT defines the
pattern by using an arbitrary point 20, preferably in a
relatively central location, and measures the distance $r$ and
edge direction $\omega$ from that point. The edge direction is
then characterized in the form of a reference point angle
$\beta$ and distance $r$, to construct a look-up table, or
R-Table, giving for each edge direction the distance/angle
displacements from the reference point that can give rise to
the template point. To apply the transform, the edge direction
is measured at each LLV-selected image point and accumulator
cells indexed by the look-up table are incremented. The
accumulated evidence in Hough parameter space is reflected as
strong peaks indicating possible matches. Unfortunately, as
parameters of rotation and scale are added, additional
dimensions must be added to the accumulator space and the
computational cost grows exponentially.
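
The look-up and voting steps of the conventional GHT described above can be sketched as follows. This is a minimal illustration, not the patented method: it handles translation only, stores Cartesian displacements rather than the distance/angle form, and the names `build_r_table`, `ght_vote`, and the 36-bin quantization `ANGLE_BINS` are assumptions introduced for the example.

```python
import math
from collections import defaultdict

ANGLE_BINS = 36  # hypothetical quantization: 10 degrees of edge direction per bin


def direction_bin(omega):
    """Quantize an edge direction (radians) to an R-Table index."""
    return int(round(omega / (2 * math.pi) * ANGLE_BINS)) % ANGLE_BINS


def build_r_table(template_edges, reference):
    """Index the displacement from each template edge point to the reference point."""
    r_table = defaultdict(list)
    for x, y, omega in template_edges:
        r_table[direction_bin(omega)].append((reference[0] - x, reference[1] - y))
    return r_table


def ght_vote(image_edges, r_table):
    """Increment one accumulator cell per (image edge, matching R-Table entry)."""
    accumulator = defaultdict(int)
    for x, y, omega in image_edges:
        for dx, dy in r_table.get(direction_bin(omega), ()):
            accumulator[(x + dx, y + dy)] += 1
    return accumulator


# Template: four edge points (x, y, edge direction) around reference point (2, 2).
template = [(0, 0, 0.0), (4, 0, math.pi / 2), (4, 4, math.pi), (0, 4, 3 * math.pi / 2)]
r_table = build_r_table(template, (2, 2))

# Image: the same pattern translated by (5, 3); rotation and scale are unchanged.
image = [(x + 5, y + 3, omega) for x, y, omega in template]
votes = ght_vote(image, r_table)
peak = max(votes, key=votes.get)  # cell with the strongest accumulated evidence
```

Because the table is built at one fixed rotation and scale, a rotated or resized object would scatter its votes, which is the limitation the following paragraphs address.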

In most cases the rotation and scale of an object are
unknown. What is desired is a method which can overcome
the limitations of the GHT while achieving an improved
space complexity. To accomplish this, a variant of the GHT
is disclosed that has the invariance “built into” the R-Table.
Hence, with a single template denoted as the hypothesis of
the desired object, invariance is built into the R-Table,
allowing the variant GHT technique to locate objects
regardless of their rotation, scale, and translation in the
image. The invariance to rotation, scale, and translation is
defined as pose, and the novel variant of the GHT is referred
to as the Pose-Invariant Hough Transform (PIHT).

In the new approach, two reference points are defined and
two points from the pattern and image are used, instead of
one. The use of pairs of points from the pattern aids in the
parameterization of the arbitrary pattern shape and builds
rotation- and scale-invariance into the R-Table. The
two-reference point R-Table, with pose-invariance included and
created from an object or pattern hypothesis, is denoted as
the J-Table. Along with the new J-Table, a derivation of
formulas using the two-image point concept (exploiting the
J-Table’s invariance) is provided which finds the desired
pattern in the image.

Note that the J-Table provides a formalization of the HLV
process. Consider that the R-Table used by the GHT is a type
of hypothesis formalism for the HLV. R-Tables essentially
provide hypotheses 16 that are tested against the image data,
as shown in FIG. 1A. Unfortunately, these hypotheses must
be correct or exact, since the R-Table is created with fixed
rotation and scale. On the other hand, the J-Table allows for
not necessarily exact hypotheses, but still allows the desired
object to be found using the PIHT algorithm as shown in
FIG. 1B.

Referring now to FIG. 3 as a diagrammatic example of the
two-reference point concept, because there are two points
($R_1$ and $R_2$) instead of one, the distance, $S_R$, between
reference points $R_1$ and $R_2$ can essentially describe the
scale. Furthermore, the figure can define the rotation angle of
the hypothesis, $\theta_R$. Thus,

$$R_1 = (x_{R_1}, y_{R_1}); \quad R_2 = (x_{R_2}, y_{R_2}).$$

Scale is equal to the distance between the two reference
points. This is easily calculated using the following distance
formula,

$$S_R = \sqrt{(x_{R_2} - x_{R_1})^2 + (y_{R_2} - y_{R_1})^2}. \tag{1}$$

Rotation is also calculated using the reference points. Recall
from trigonometry that the angle, $\theta$, can be obtained
from its tangent by taking the inverse tangent (arc tangent) of
both sides. The rotation formula, Equation 2, is achieved:

$$\theta_R = \tan^{-1}\left(\frac{y_{R_2} - y_{R_1}}{x_{R_2} - x_{R_1}}\right) \tag{2}$$

The standard settings are $S_R = 1$ and $\theta_R = 0°$, or
$\theta_R = 0$ radians (rads).
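
As a small worked illustration of the scale and rotation formulas above, the following sketch computes both quantities from two reference points. The function name `reference_pose` is hypothetical, and `math.atan2` is used in place of a plain arc tangent as an assumption so that vertical alignments and all four quadrants are handled.

```python
import math


def reference_pose(r1, r2):
    """Return (scale, rotation in radians) described by two reference points."""
    dx, dy = r2[0] - r1[0], r2[1] - r1[1]
    scale = math.hypot(dx, dy)     # distance formula, Equation 1
    rotation = math.atan2(dy, dx)  # rotation formula, Equation 2 (atan2 is an assumption)
    return scale, rotation


# Reference points one unit apart on the x-axis give the standard settings.
standard = reference_pose((0.0, 0.0), (1.0, 0.0))
# Two units apart on the y-axis: scale 2, rotation of a quarter turn.
upright = reference_pose((0.0, 0.0), (0.0, 2.0))
```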

To provide the rotation- and scale-invariance desired, the
two-image point concept mentioned earlier is used. The
two-image point concept effectively means that a collection of
boundary point pairs on an object is used to store the
pose-invariant information in the J-Table (hypothesis), as
well as to calculate pose data from the image, which is
compared to the contents of the J-Table (detection). If every
arrangement of any two boundary points were used, there
would be at most



combinations of pose-invariant results, which equates to a
J-Table with as many rows. Since the basic methodology of
the GHT is to examine each row of the table, the complexity
for this method would be on the order of $O(2^n)$. In practice
this is clearly inefficient, especially if $n$ were very large. A
more attractive method utilizes only a relevant (i.e., smaller)
subset of points. This can be accomplished by considering
gradient angles and still using the two-image point concept.
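
One way to picture the reduction is to group edge points by gradient angle and pair points only within a group. The sketch below assumes gradient angles in degrees folded into [0, 180) and a hypothetical bin width `bin_degrees`; neither the function name nor the binning is taken from the source.

```python
from collections import defaultdict
from itertools import combinations


def same_gradient_pairs(edge_points, bin_degrees=1):
    """Pair only those edge points whose quantized gradient angles agree."""
    bins = defaultdict(list)
    for x, y, psi_deg in edge_points:
        key = int((psi_deg % 180) // bin_degrees)  # fold into [0, 180) and quantize
        bins[key].append((x, y))
    pairs = []
    for key in sorted(bins):
        pairs.extend(combinations(bins[key], 2))
    return pairs


# Five edge points (x, y, gradient angle in degrees): three at 30, two at 90.
edges = [(0, 0, 30), (1, 1, 30), (2, 0, 90), (3, 3, 30), (4, 4, 90)]
pairs = same_gradient_pairs(edges)
all_pairs = list(combinations(edges, 2))  # every arrangement of two points
```

In this toy input only four of the ten possible pairs survive, and the saving grows as the edge points spread over more gradient angles.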

The underlying approach is to handle pairs of boundary
points, connected by a line, that are parallel to each other in
terms of their gradient angle. Refer to FIG. 4, containing an
object of unknown rotation, unknown scale, and unknown
translation. The rotation- and scale-invariant J-Table is
defined with regard to this geometric model. The angle
formed from the line connecting points i and j, whose
direction angle is $\alpha$, to the gradient at either i or j
provides the unique Parallel Gradient Identifier Angle,
$\phi$, or parallel angle for short. The parallel angle is used
as the primary index value of the new J-Table, denoted
$\phi_h$. The parallel angle, $\phi_h$, is calculated from the
difference between the gradient angle, $\psi_h$, where
$\psi_h = \psi_i = \psi_j$, and the direction angle,
$\alpha_h$, for points i and j. Thus,

$$\phi_h = \psi_h - \alpha_h \tag{3}$$

The parallel angle, as shown on FIG. 3, allows the unique
identification of identical pairs of gradient angles, $\psi$,
separated by different point distances, $S_h$, and having
different direction angles, $\alpha$.

As shown on FIG. 4, points i and j have the same gradient
angle, $\psi$, as points i and q. Even though these two pairs
of points have matching gradient angles, they are clearly
distinguished from each other by the use of the parallel angle
methodology. It is obvious from the figure that the parallel
angle for points i and j, $\phi_{ij}$, calculates to a value
different than the parallel angle for points i and q,
$\phi_{iq}$, such that $\phi_{ij} \neq \phi_{iq}$.
Consequently, each of these pairs will map to a unique entry
in the J-Table. Consider for a moment how the parallel angle,
$\phi_h$, is obtained from the gradient angle, $\psi_h$, and
the direction angle, $\alpha_h$.

With gradients of positive slope, $\psi$ values are in the
range $[0, \pi/2]$. When the direction angle, $\alpha$, is
greater than $\pi/2$ but less than $\pi$, the $-\alpha$ angle is
used in the calculation, that is,

$$-\alpha = \alpha - \pi.$$

Thus, in these situations $\phi = \psi - (-\alpha)$ or
$\phi = \psi + \alpha$. On the other hand, with gradients of
negative slope, $\psi$ values are in the range
$[\pi/2, \pi]$. When the direction angle, $\alpha$, is less
than $\pi/2$ but greater than zero, $\alpha$ is used directly
in Equation 3.

These observations of the gradient angle and the direction
angle lead to the following additional remark for $\alpha$. The
direction angle, $\alpha_h$, can be calculated by:



Therefore, with these observations the parallel angle is
obtained from the result of $\alpha_h$ given above (Equation 4)
by simply using Equation 3. As a result, Equation 3 provides
the new, unique index into the J-Table.
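
The index computation can be illustrated with a short sketch. Two assumptions are made loudly here: the direction angle is taken as the angle of the line joining the two points, computed with `math.atan2`, and both angles are folded into [0, π) with a modulo instead of reproducing the sign cases discussed above. The helper names are hypothetical.

```python
import math


def direction_angle(p_i, p_j):
    """Angle of the line joining two points, folded into [0, pi) (assumed form)."""
    return math.atan2(p_j[1] - p_i[1], p_j[0] - p_i[0]) % math.pi


def parallel_angle(psi, p_i, p_j):
    """Equation 3: gradient angle minus direction angle, folded into [0, pi)."""
    return (psi - direction_angle(p_i, p_j)) % math.pi


def rotate(p, delta):
    """Rotate a point about the origin by delta radians."""
    c, s = math.cos(delta), math.sin(delta)
    return (c * p[0] - s * p[1], s * p[0] + c * p[1])


# A pair with gradient angle pi/2 joined by a 45-degree line.
phi = parallel_angle(math.pi / 2, (0.0, 0.0), (1.0, 1.0))

# Rotating the whole object rotates gradient and direction angles alike,
# so their difference (the J-Table index) is unchanged.
delta = 0.3
phi_rotated = parallel_angle(
    math.pi / 2 + delta, rotate((0.0, 0.0), delta), rotate((1.0, 1.0), delta)
)
```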

To impart scale-invariance, note that the distance between
the two parallel image points is $S_h$. Using the mid-point,
$(x_m, y_m)$, between the two parallel boundary points,
$(x_i, y_i)$ and $(x_j, y_j)$, Equations 5 through 8 are used.

$$\rho_i = \sqrt{(x_R - x_m)^2 + (y_R - y_m)^2} \tag{5}$$


$$\theta_i = \tan^{-1}\left(\frac{y_R - y_m}{x_R - x_m}\right) \tag{6}$$

where,

$$x_m = \frac{x_i + x_j}{2} \tag{7}$$

$$y_m = \frac{y_i + y_j}{2} \tag{8}$$

Scale-invariance occurs after normalization of the radial
distance, $\rho_i$, portion of the $P_i$ vector by the point
distance, $S_h$. This results in $\rho_{S_h}$, where
$\rho_{S_h}$ is defined as in Equation 9 and $S_h$ is defined
as in Equation 10.

$$\rho_{S_h} = \frac{\rho_i}{S_h} \tag{9}$$

$$S_h = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}. \tag{10}$$

Rotation-invariance is achieved when the parallel angle,
$\phi_h$, is calculated from a line passing between the two
points i and j, to one of the gradient angles, $\psi_i$ or
$\psi_j$, where $\psi_i = \psi_j$.

Consequently, for all combinations of parallel gradient
image points, i and j, with direction angle, $\alpha_{ij}$, if
the parallel angle of points i and j, $\phi_{ij}$, equals the
parallel angle of the $h^{th}$ row of the J-Table, that is,
$\phi_{ij} = \phi_h$, this indicates that the parallel gradient
combination of i and j may belong to the $h^{th}$ combination
of the J-Table. Hence, if the desired object is rotated by
$\delta$, all gradient angles (parallel or otherwise) and
radial vectors are also rotated by $\delta$, that is,

(11)

With this information the new J-Table, denoted
$J(\phi, \mathcal{L}_\rho)$, is shown in Table 1.


TABLE 1

J-Table Concept

PARALLEL ANGLE ($\phi_h$)    Set $\mathcal{L}_\rho$

$\phi_0$    $\{[\rho_S(\phi_0), \theta_0(\phi_0)]_0, S_h(\phi_0), \alpha_0(\phi_0); \ldots; [\rho_S(\phi_0), \theta_k(\phi_0)]_k, S_h(\phi_0), \alpha_k(\phi_0)\}$

$\phi_1$    $\{[\rho_S(\phi_1), \theta_0(\phi_1)]_0, S_h(\phi_1), \alpha_0(\phi_1); \ldots; [\rho_S(\phi_1), \theta_k(\phi_1)]_k, S_h(\phi_1), \alpha_k(\phi_1)\}$

$\phi_n$    $\{[\rho_S(\phi_n), \theta_0(\phi_n)]_0, S_h(\phi_n), \alpha_0(\phi_n); \ldots; [\rho_S(\phi_n), \theta_k(\phi_n)]_k, S_h(\phi_n), \alpha_k(\phi_n)\}$

Parallel Angle, $\phi_h$, where h = 0, . . . , n.

Set $\mathcal{L}_\rho = \{[\rho_{S_h}(\phi_h), \theta_i(\phi_h)], S_h(\phi_h), \alpha_i(\phi_h) \mid i = 0, \ldots, k; h = 0, \ldots, n\}$,
where $\rho = 0, \ldots, n$.

From Table 1, the mapping is vector-valued. Notice that
the J-Table contains two additional entries for each row: the
distance between points in the hypothesis, $S_h$, and the
direction angle, $\alpha_i$.
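
To make the contents of a J-Table element concrete, the sketch below computes the index and the stored values for a single pair of boundary points. It is an illustration under assumptions: `j_table_entry` is a hypothetical helper, the direction angle and the radial angle are computed with `math.atan2`, and the radial vector is taken from the mid-point toward the reference point.

```python
import math


def j_table_entry(p_i, p_j, psi, reference):
    """Return (phi, rho_s, theta, s_h, alpha) for two points sharing gradient angle psi."""
    s_h = math.hypot(p_i[0] - p_j[0], p_i[1] - p_j[1])              # Equation 10
    alpha = math.atan2(p_j[1] - p_i[1], p_j[0] - p_i[0]) % math.pi  # direction angle (assumed form)
    phi = (psi - alpha) % math.pi                                   # Equation 3
    x_m, y_m = (p_i[0] + p_j[0]) / 2, (p_i[1] + p_j[1]) / 2         # Equations 7 and 8
    rho = math.hypot(reference[0] - x_m, reference[1] - y_m)        # Equation 5
    theta = math.atan2(reference[1] - y_m, reference[0] - x_m)      # radial angle (assumed form)
    return phi, rho / s_h, theta, s_h, alpha                        # rho / s_h is Equation 9


# The same configuration at two sizes: points, mid-point and reference all scaled by 3.
small = j_table_entry((0.0, 0.0), (2.0, 0.0), math.pi / 2, (1.0, 1.0))
large = j_table_entry((0.0, 0.0), (6.0, 0.0), math.pi / 2, (3.0, 3.0))
```

The index and the normalized radial distance agree for both sizes, while the stored point distance records the size of the hypothesis. Coincident points (zero point distance) would need to be skipped before the division.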

For detection, if the parallel angle, $\phi_{ij}$, formed by
the line passing between points i and j (where
$\psi_i = \psi_j$) is equal to the parallel angle of the
$h^{th}$ row of the J-Table, $\phi_h$, then these two points
may belong to the $h^{th}$ row of the table. Consequently, each
reference point of the two-reference point concept can be
calculated as follows: Let $(x_R, y_R)$ designate the location
of the reference point in the image, $[\rho_{S_h}, \theta_i]$
is the vector with parallel angle, $\phi_h$, and the mid-point
is $(x_m, y_m)$ between i and j with point distance in the
image, $S_t$; hence, the reference point equations from the
GHT now become:


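$$x_R = x_m + \left[(\rho_{S_h}(\phi_h) \times S_t) \cos(\theta_i(\phi_h) + \delta)\right] \tag{12}$$

$$y_R = y_m + \left[(\rho_{S_h}(\phi_h) \times S_t) \sin(\theta_i(\phi_h) + \delta)\right]. \tag{13}$$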

To symbolize the pose-invariance above, let the radial
vector now be denoted as $P_\rho$ such that

$$P_\rho = [\rho_{S_h} \times S_t, \theta_\rho]$$

where 



$$\theta_\rho = \theta_i + \delta. \tag{15}$$

Once the two reference points are selected, the J-Table is
formed with respect to these two points. The J-Table now
contains the data aiding in the pose-invariance of the PIHT
procedure, which may be executed as follows, as reflected in
steps 35, 37, 39, and 41 of FIG. 8A.


1. Initialize the Hough parameter space accumulator, A[][] = 0.

2. For gradient angles, $\psi$, from 0° to 179°:

   a. For each pair of edge points with the same gradient angle, $\psi_i$:

      i. $\phi_{ij} = \psi_i - \alpha_{ij}$. // Calculate $\phi_{ij}$ between edge points.

      ii. For each row of the J-Table, h:
          1. If $\phi_{ij} = \phi_h$:

             a. (Use Equations 11 through 13 in sequence.)
             b. $A[x_R][y_R] = A[x_R][y_R] + 1$.

3. Any two relative peaks in the accumulator array satisfying Equations
   16 and 17, below, indicate the position of the desired object.
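
The voting procedure listed above can be sketched end to end as follows. This is a hedged illustration rather than the patented implementation: the 1-degree quantization `BINS`, the helper names, and the `math.atan2` forms of the direction and radial angles are assumptions. In particular, the rotation $\delta$ is estimated here as the difference between the image and hypothesis direction angles, with both $\delta$ and $\delta + \pi$ voted because direction angles are folded into [0, π); the source's own Equation 11 is not reproduced.

```python
import math
from collections import defaultdict
from itertools import combinations

BINS = 180  # hypothetical 1-degree quantization of angles folded into [0, pi)


def angle_bin(angle):
    return round((angle % math.pi) / math.pi * BINS) % BINS


def same_gradient_pairs(edges):
    """Yield pairs of edge points (x, y, psi) whose gradient angles fall in the same bin."""
    groups = defaultdict(list)
    for x, y, psi in edges:
        groups[angle_bin(psi)].append((x, y, psi))
    for group in groups.values():
        yield from combinations(group, 2)


def pair_geometry(p_i, p_j):
    """Return (parallel angle, direction angle, point distance, mid-point)."""
    (x_i, y_i, psi), (x_j, y_j, _) = p_i, p_j
    alpha = math.atan2(y_j - y_i, x_j - x_i) % math.pi
    mid = ((x_i + x_j) / 2, (y_i + y_j) / 2)
    return (psi - alpha) % math.pi, alpha, math.hypot(x_i - x_j, y_i - y_j), mid


def build_j_table(edges, references):
    """Store (rho_s, theta, alpha) per pair and reference point, indexed by parallel angle."""
    table = defaultdict(list)
    for p_i, p_j in same_gradient_pairs(edges):
        phi, alpha, s_h, (x_m, y_m) = pair_geometry(p_i, p_j)
        for x_r, y_r in references:
            rho_s = math.hypot(x_r - x_m, y_r - y_m) / s_h  # Equations 5 and 9
            theta = math.atan2(y_r - y_m, x_r - x_m)
            table[angle_bin(phi)].append((rho_s, theta, alpha))
    return table


def piht_vote(edges, j_table):
    """Steps 1 and 2: accumulate candidate reference point locations."""
    acc = defaultdict(int)
    for p_i, p_j in same_gradient_pairs(edges):
        phi, alpha, s_t, (x_m, y_m) = pair_geometry(p_i, p_j)
        for rho_s, theta, alpha_h in j_table.get(angle_bin(phi), ()):
            for delta in (alpha - alpha_h, alpha - alpha_h + math.pi):  # assumed estimate
                x_r = x_m + rho_s * s_t * math.cos(theta + delta)  # Equation 12
                y_r = y_m + rho_s * s_t * math.sin(theta + delta)  # Equation 13
                acc[(round(x_r), round(y_r))] += 1
    return acc


def transform(x, y, rot, scale, tx, ty):
    c, s = math.cos(rot), math.sin(rot)
    return scale * (c * x - s * y) + tx, scale * (s * x + c * y) + ty


# Hypothesis: six edge points (x, y, gradient angle in degrees) and two reference points.
hypothesis = [(0, 0, 90), (4, 0, 90), (1, 3, 90), (0, 1, 0), (0, 3, 0), (5, 2, 0)]
hypothesis = [(x, y, math.radians(d)) for x, y, d in hypothesis]
j_table = build_j_table(hypothesis, [(2, 1), (3, 2)])

# Image: the hypothesis rotated by 30 degrees, scaled by 2, translated by (10, 5).
rot = math.radians(30)
image = [(*transform(x, y, rot, 2, 10, 5), psi + rot) for x, y, psi in hypothesis]
votes = piht_vote(image, j_table)
```

In this toy case each of the six point pairs casts one vote at the cell containing each transformed reference point, near (12.46, 8.73) and (13.20, 11.46), without any search over rotation or scale; spurious votes from coincidental index matches land elsewhere.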


Global maxima in the accumulator array indicate the
translation of the two reference points; however, the rotation
and scale of the object remain unknown. To find rotation,
simply use the following equation with the two candidate
reference points, denoted $(X_1, Y_1)$ and $(X_2, Y_2)$:

$$\theta = \tan^{-1}\left(\frac{Y_2 - Y_1}{X_2 - X_1}\right) \tag{16}$$

Scale is obtained by also using previously defined formulas
merged together to provide a scale ratio,

$$\frac{S}{S_R} = \frac{\sqrt{(X_2 - X_1)^2 + (Y_2 - Y_1)^2}}{\sqrt{(x_{R_2} - x_{R_1})^2 + (y_{R_2} - y_{R_1})^2}} \tag{17}$$

Therefore, a rotation- and scale- (i.e., pose-) invariant
Hough Transform has been developed and presented in great
detail. This new variant exploits the geometry that
accompanies the two-reference point and two-image point
concepts used. Furthermore, a new version of the R-Table,
called the J-Table, was derived which contains
pose-invariance of a hypothesized object and aids the
execution of a procedure specifically designed for use with
the table.

The two reference points located by the PIHT, alone,
meet the requirement for an indicator-of-match that the
hypothesized object is actually in the image. However, this
alone does not provide the requisite level-of-match. A
supplemental operation using the recently found reference
points can provide a delineation of the desired region based
on these two reference points (i.e., explicit
indicator-of-match).

Since the Hough Transform is just that, a transform, a
reverse or inverse operation can be executed on it.
Consequently, in this section the details of an
Inverse-Pose-Invariant Hough Transform (IPIHT) are provided.
The IPIHT uses the results from the PIHT and, in conjunction
with the hypothesis (i.e., J-Table), generates a delineation
corresponding to the desired region, thus providing a more
explicit indicator-of-match. Additionally, another transform,
known as the Distance Transform, is utilized to provide a
quantitative level-of-match between the extracted region and
the actual region in the image.

FIG. 5 depicts the geometry involved and initial
calculations performed by the IPIHT algorithm.

For each row entry member of the J-Table, the IPIHT
effectively superimposes the elements of the set,
$\mathcal{L}_\rho$, onto a reference point, as shown in FIG. 4.
Once an element of the set, $\mathcal{L}_\rho$, is virtually
superimposed onto a reference point, R, the first set of
calculations obtains the mid-point, $(X_m, Y_m)$, of the
desired boundary point pair, $(X_i, Y_i)$ and $(X_j, Y_j)$. To
achieve this, note that the radial distance, $\rho_{S_h}$, and
radial angle, $\theta_i$, of the radial vector, $P_\rho$, are
manipulated. Since $\rho_{S_h}$ is normalized by $S_h$ and the
recognized object may be at a larger or smaller scale than the
hypothesis, obtaining the correct location for $(X_m, Y_m)$
requires the following calculation of $\rho_\rho$ (at the
index, $\phi_h$, of the J-Table):

$$\rho_\rho(\phi_h) = \left(\rho_{S_h}(\phi_h) \times S_h(\phi_h)\right) \times \frac{S}{S_R} \tag{18}$$


Also consider that the recognized object may be at a
rotation greater than or equal to the 0° rotation of the
hypothesis. Thus, the radial angle portion, $\theta_i$, of
$P_\rho$ must be rotated by the rotation angle, $\theta_R$.
This new angle must then be rotated an additional 180° to
direct $\rho_\rho$ to the mid-point, $(X_m, Y_m)$, for
boundary point pairs, $(X_i, Y_i)$ and $(X_j, Y_j)$.
Consequently,

$$\theta_{\rho_i}(\phi_h) = \theta_i(\phi_h) + \theta_R + \pi. \tag{19}$$

With Equations 18 and 19, the calculation of the mid-point,
$(X_m, Y_m)$, associated with elements of the set,
$\mathcal{L}_\rho$, from row $\phi_h$ of the J-Table, is
obtained by:

$$X_m = X_n + \left[\rho_\rho(\phi_h) \cos(\theta_{\rho_i}(\phi_h))\right]; \text{ for } n = 1 \text{ and } 2 \tag{20}$$

$$Y_m = Y_n + \left[\rho_\rho(\phi_h) \sin(\theta_{\rho_i}(\phi_h))\right]; \text{ for } n = 1 \text{ and } 2 \tag{21}$$

Now that the mid-point, $(X_m, Y_m)$, has been obtained using
the $S_h$ element of $\mathcal{L}_\rho$ and the scale ratio,
$S/S_R$, the boundary point pair, $(X_i, Y_i)$ and
$(X_j, Y_j)$, associated with $(X_m, Y_m)$ can be acquired.

Given $(X_m, Y_m)$, acquiring $(X_i, Y_i)$ is a matter of
utilizing not only $S_h$ and $S/S_R$ but also the rotation
angle, $\theta_R$. Once again, since the recognized object may
be at a larger or smaller scale than the hypothesis, each
$S_h$ value must be scaled to the proper size for the desired
object. However, the results of Equations 20 and 21 have
effectively situated the current state of calculations at the
center of this distance. Accordingly, only half the distance
is needed to find $(X_i, Y_i)$ and $(X_j, Y_j)$ from the
coupled mid-point, $(X_m, Y_m)$; hence,

$$S_B(\phi_h) = \frac{1}{2}\left(S_h(\phi_h) \times \frac{S}{S_R}\right) \tag{22}$$

Since a positive angle of rotation is always assumed, all boundary points on the desired object are rotated by $\theta_R$, with respect to their corresponding boundary points in the hypothesis. Furthermore, all boundary point pairs are matching gradient angles, and thus, are always 180° apart from each other. Nonetheless, recall from Section II-A that these boundary point pairs are rotated by $\sigma_h$. Therefore, to obtain the correct angles that will point $S_B$ to boundary point pairs, $(X_i, Y_i)$ and $(X_j, Y_j)$, from an associated mid-point, $(X_m, Y_m)$, requires the following consideration:

$$\theta_{B_j}(\phi_h) = \sigma_i(\phi_h) + \theta_R \qquad (23)$$

where $\sigma_i(\phi_h)$ is the direction angle element extracted from the J-Table, of the set, $\mathcal{L}_p$, at row $\phi_h$. To aim to the opposing boundary point requires a 180° rotation,

$$\theta_{B_i}(\phi_h) = \sigma_i(\phi_h) + \theta_R + \pi. \qquad (24)$$

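As a result, the boundary point pairs are established by the following formulas:

$$X_j = X_m + [S_B(\phi_h)\cos(\theta_{B_j}(\phi_h))] \qquad (25)$$

$$Y_j = Y_m + [S_B(\phi_h)\sin(\theta_{B_j}(\phi_h))] \qquad (26)$$

$$X_i = X_m + [S_B(\phi_h)\cos(\theta_{B_i}(\phi_h))] \qquad (27)$$

$$Y_i = Y_m + [S_B(\phi_h)\sin(\theta_{B_i}(\phi_h))]. \qquad (28)$$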



Using the same J-Table and Equations 18 through 28, the 
IPIHT algorithm is executed as follows and reflected in steps 
45, 47, 49, 51 and 53 of FIG. 8B. 



1. Obtain the J-Table (i.e., hypothesis) representation of the object from the PMT (Table 1).

2. For each reference point, n, where n = 1, 2:

   a. For each row of the J-Table, h, where $\phi_h$ is an index:

      // First, find mid-point, $(X_m, Y_m)$.

      i. (Use Equations 18 through 21 in sequence.)

      // Next, find boundary point pair, $(X_j, Y_j)$.

      ii. (Use Equations 22, 23, 25, and 26 in sequence.)

      // Now, find boundary point pair, $(X_i, Y_i)$.

      iii. (Use Equations 24, 27, and 28 in sequence.)

      iv. Save locations, $(X_i, Y_i)$ and $(X_j, Y_j)$, into the set of located boundary points, $\Im$, where $\Im = \{(X_k, Y_k) \mid k = 0, \ldots, n\}$ and n is the total number of boundary points stored in the J-Table.

      v. Visually identify locations, $(X_i, Y_i)$ and $(X_j, Y_j)$, in the image as boundary points of the desired object.

3. The resulting delineation provides an explicit indicator-of-match between the extracted object and the actual object in the image.
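
The inner loop of the IPIHT steps above can be sketched for a single J-Table entry as follows. This is only an illustrative sketch, not the patented implementation: the function name `ipiht_entry` and its argument names are hypothetical, Equation 18 is assumed to scale the stored radial distance by the scale ratio, and Equation 22 is assumed to halve the scaled pair distance, as the surrounding text describes.

```python
import math


def ipiht_entry(ref_point, rho, theta, s_h, sigma, scale_ratio, theta_r):
    """Map one J-Table entry back into image space (hypothetical helper).

    ref_point   -- (X_n, Y_n), a located reference point
    rho, theta  -- stored radial distance and angle for this entry
    s_h, sigma  -- stored pair distance and direction angle
    scale_ratio -- the scale ratio S / S_R
    theta_r     -- rotation angle, in radians
    Returns (mid_point, point_i, point_j).
    """
    x_n, y_n = ref_point

    # Assumed form of Equation 18: scale the stored radial distance.
    rho_s = scale_ratio * rho
    # Equation 19: rotate by theta_r plus an additional 180 degrees.
    theta_rho = theta + theta_r + math.pi

    # Equations 20 and 21: mid-point of the boundary point pair.
    x_m = x_n + rho_s * math.cos(theta_rho)
    y_m = y_n + rho_s * math.sin(theta_rho)

    # Assumed form of Equation 22: half of the scaled pair distance.
    s_b = 0.5 * scale_ratio * s_h

    # Equations 23 and 24: directions from the mid-point to each point.
    theta_bj = sigma + theta_r
    theta_bi = sigma + theta_r + math.pi

    # Equations 25 through 28: the boundary point pair.
    point_j = (x_m + s_b * math.cos(theta_bj), y_m + s_b * math.sin(theta_bj))
    point_i = (x_m + s_b * math.cos(theta_bi), y_m + s_b * math.sin(theta_bi))
    return (x_m, y_m), point_i, point_j
```

Under these assumptions the two returned points always lie symmetrically about the mid-point, separated by the scaled pair distance, which is a convenient sanity check when experimenting with the sketch.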


Distance Transform 

A natural quantitative measure of match between an extracted region and the actual desired region is a metric that equals zero for a perfect match and grows larger as the boundaries move further apart in terms of Euclidean distance.

In order to evaluate the similarity between an extracted region and the desired region in an image, a correspondence between points on both curves is needed. Due to the discrete values in digital images and the noise that may exist at edge pixels, it is unnecessary to spend time and computational effort computing exact Euclidean distances. An appropriate measure that overcomes these problems is the Distance Transform, also known as Chamfer Matching.

The Distance Transform (DT) essentially converts an edge image, consisting of object-edge and background pixels, into a gray-level image that denotes the distance from each background pixel to the nearest object-edge pixel. The new image produced by the DT is called a distance image. The DT can be computed by the use of several masks, each of which allows a 2D transformation to occur; the most common is a 3x3 mask, which obtains the Chamfer-3/4 distance.
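
A distance image of the kind described above can be sketched with the familiar two-pass chamfer scheme. The code below is a minimal illustration, not the patented implementation: the function name `chamfer_34`, the list-of-lists image layout, and the cap of 255 on distance values are all assumptions made for this example.

```python
def chamfer_34(edge, cap=255):
    """Two-pass chamfer 3-4 distance transform (illustrative sketch).

    edge -- list of rows; truthy entries mark object-edge pixels
    cap  -- assumed upper bound so values stay in the gray-level range
    Returns a distance image with 0 at edge pixels.
    """
    rows, cols = len(edge), len(edge[0])
    dist = [[0 if edge[r][c] else cap for c in range(cols)] for r in range(rows)]

    # Half of the 3x3 mask used in each pass: (row offset, col offset, weight).
    forward = [(-1, -1, 4), (-1, 0, 3), (-1, 1, 4), (0, -1, 3)]
    backward = [(1, 1, 4), (1, 0, 3), (1, -1, 4), (0, 1, 3)]

    def sweep(row_order, col_order, mask):
        for r in row_order:
            for c in col_order:
                for dr, dc, w in mask:
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        dist[r][c] = min(dist[r][c], dist[rr][cc] + w)

    # Forward pass: top-left to bottom-right.
    sweep(range(rows), range(cols), forward)
    # Backward pass: bottom-right to top-left.
    sweep(range(rows - 1, -1, -1), range(cols - 1, -1, -1), backward)
    return dist
```

For a 3x3 image with a single edge pixel at the center, this sketch yields 3 for the four axial neighbors and 4 for the four diagonal neighbors, which is where the 3/4 in the name comes from.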

To obtain the actual Matching Metric (i.e., level-of-match), M, a correspondence is needed between the extracted region and the distance image, D. The DT has been shown to be exceedingly appropriate when used with a Matching Metric to correspond edge images. Recall that the PIHT/IPIHT framework ultimately provides a set of located boundary points, $\Im$.

Hence, the Matching Metric, M, is computed by projecting each xy-coordinate pair from the set $\Im$ onto the corresponding (x, y) location of the distance image, D. By traversing the projection of the coordinate pairs, $(X_k, Y_k)$, an average cumulative sum of the distance values at each (x, y) location in D is obtained. This average cumulative sum is thus defined as the Matching Metric, M. Consequently, a perfect fit between the set $\Im$ and the distance image, D, yields a Matching Metric of zero.

The Root Mean Square (RMS) Average obtains drastically fewer false minima than other functions that might be used as Matching Metrics. The RMS Average is defined as follows:

$$M = \frac{1}{3}\sqrt{\frac{1}{n}\sum_{k} d_k^{2}} \qquad (29)$$

where $d_k$ are the distance values from the distance image, D, and n is the number of coordinate pairs, $(X_k, Y_k)$, from the set $\Im$.
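
As a rough illustration of the Matching Metric, the sketch below looks up the distance value under each located boundary point and takes an RMS average. It assumes the RMS average carries a normalizing factor of 1/3, which is inferred from the worst-case value of 85.0 quoted for distance values of 255; the function name `matching_metric`, the rounding of coordinates to the nearest pixel, and the `dist_image[y][x]` indexing are likewise assumptions.

```python
import math


def matching_metric(points, dist_image):
    """RMS-average Matching Metric over located boundary points (sketch).

    points     -- iterable of (x, y) coordinate pairs
    dist_image -- distance image indexed as dist_image[y][x] (assumed layout)
    """
    # Project each coordinate pair onto the distance image.
    values = [dist_image[int(round(y))][int(round(x))] for x, y in points]
    mean_square = sum(d * d for d in values) / len(values)
    # Assumed 1/3 normalization for chamfer 3-4 distance units.
    return math.sqrt(mean_square) / 3.0
```

A set of points that falls exactly on edge pixels gives a metric of zero, while points that all land on the maximum distance value of 255 give 85.0 under the assumed normalization.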

Now, a complete procedure for implementing the ILV process can be presented. This procedure includes the PIHT/IPIHT framework for finding reference points and delineating regions, the creation of a distance image from a DT, and, subsequently, the calculation of a Matching Metric between an extracted region and the actual desired region from an image.

PIHT/IPIHT Framework with DT and Matching Metric (ILV Process)

1. Execute PIHT algorithm, as shown in FIG. 8A.

2. Execute IPIHT algorithm, as shown in FIG. 8B.

3. Execute DT algorithm, as reflected in step 59 of FIG. 8C.

4. Execute Matching Metric, M:

   A. For k=0 to n:

      (i) Get $(X_k, Y_k) \in \Im$.

      (ii) Get $d_k$ at $(X_k, Y_k)$ from distance image, D.

   B. Result is M, which quantitatively enumerates how closely the extracted region from the PIHT/IPIHT framework fits the actual desired region from the image, as reflected in step 61 of FIG. 8C.

Therefore, starting with a high-level hypothesis and using the image data, the ILV process detailed above obtains both the level-of-match and the indicator-of-match, as required, thus meeting the goal of this work.

EXAMPLE: DUALTEST

The efficiency of the PIHT/IPIHT framework depends on several factors, including quality of segmentation, accuracy of gradient determination, and validity of Hough Transform capabilities. To determine whether any or all of these factors affect the performance of the framework, the example described below and in FIG. 6 is provided.

This example presents an image which contains both the quadrilateral and arbitrary shape, along with other shapes. The purpose is to review the ILV implementation's capability of finding one shape when other shapes are also present in the image. In other words, this can be regarded as a synthetic experimental example of a real-world image.

FIG. 6A shows the DUALTEST image, and FIG. 6B illustrates the hypothesis for the arbitrary shape. Note that the desired object for extraction (i.e., the arbitrary shape) is rotated and unscaled from its hypothesis.

A graphically enhanced surface plot of the Hough Parameter Space (HPS) is shown in FIG. 7.

Notice that the quadrilateral shapes (i.e., rectangle and square) are not recognized; nonetheless, the other arbitrary and analytical shape did cause some false voting to occur. Clearly, this did not affect the PIHT's ability to indicate the existence of the desired arbitrary shape by locating its resultant reference points.

In this case, a Matching Metric, M, of 1.158 was achieved. The upper portions of the object were not outlined exactly, causing the M value to move farther from zero. Nevertheless, the desired region was extracted; M values of less than 4.0 still indicate acceptable recognition.

Analysis of Capabilities

In this section, a tabular analysis of several experiments is presented. The tabulated data provide two measurements: Absolute M and Percentage Error from Zero (PEZ). Absolute M is simply the Matching Metric obtained for that particular test. PEZ is calculated based on the worst-case delineation of an extracted region. Recall that M is calculated over all the coordinate pairs using Equation 29. If the worst-case delineation is assumed, every $d_k$ projects to the highest possible value (i.e., 255) in a distance image. Hence, in the worst case, Equation 29 becomes:

$$\frac{1}{3}\sqrt{\frac{(255)^{2} \cdot n}{n}}$$

Essentially, the worst-case M now becomes 85.0 for all n. Consequently, PEZ becomes:

$$\text{PEZ} = \left(\frac{\text{Absolute } M}{\text{Worst-Case } M}\right) \times 100. \qquad (30)$$

Larger PEZ values indicate how far from zero, or how close to the worst case possible, the absolute M was. Clearly, PEZ values close to or at 0% are preferred.
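
The PEZ calculation is simple enough to check by hand, and a small sketch makes the arithmetic explicit. The function name `pez` and the constant name `WORST_CASE_M` are hypothetical; the value 85.0 is the worst-case M stated in the text.

```python
# Worst-case Matching Metric stated in the text (all distance values at 255).
WORST_CASE_M = 85.0


def pez(absolute_m, worst_case_m=WORST_CASE_M):
    """Percentage Error from Zero for a given absolute Matching Metric."""
    return (absolute_m / worst_case_m) * 100.0
```

For instance, `pez(1.158)` evaluates to roughly 1.362, and `pez(0.818)` to roughly 0.962, both well under 2% of the worst case.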

In Table 2, the tabulated results for the simple experi- 
ments, both quadrilateral and arbitrary (neither exclusively 
circular nor quadrilateral), and Special Case are shown. 


TABLE 2

Quadrilateral, Arbitrary, and Special Case Experiments: Absolute and Percentage Error Results

                         UR-US   R-US    UR-HS   UR-DS   R-HS    R-DS
QUAD          Absolute   0.818   0.715   0.822   0.578   0.709   1.000
              PEZ (%)    0.962   0.841   0.976   0.680   0.834   1.176
ARB           Absolute   0.750   1.082   1.056   0.950   0.882   1.021
              PEZ (%)    0.882   1.273   1.242   1.118   1.038   1.201
Special Case  Absolute   0.871   1.447   0.929   1.261   1.076   1.582
              PEZ (%)    1.025   1.702   1.093   1.484   1.266   1.861

NOTE:

The legend for the table above and for those that follow is: QUAD, Quadrilateral object; ARB, Arbitrary object; UR-US, Unrotated and Unscaled; R-US, Rotated and Unscaled; UR-HS, Unrotated and Half-Scaled; UR-DS, Unrotated and Double-Scaled; R-HS, Rotated and Half-Scaled; R-DS, Rotated and Double-Scaled.

As expected, the quadrilateral results obtained low absolute values, which correspond to low PEZ percentages. The best test case achieved was the unrotated and double-scaled object, while the highest numbers came from the rotated and double-scaled experiment. Overall, error percentages less than or close to 1% clearly indicate that each of these objects (however differently posed from the original hypothesis) was recognized by the ILV implementation. For the Special Case experiments, although most error percentages are above 1% and two approach 2% (R-US and R-DS), this still remains a very good indication of object recognition. Recall that the Special Case test used a hypothesis that was not necessarily exact in shape to the object desired for extraction. Even so, no anomalous test case (i.e., one with wildly high absolute and PEZ values) was documented. Thus, this situation demonstrates the ILV implementation's capability of recognizing objects even when hypotheses are not identical to the desired object.

Table 3 provides the data on the DUALTEST image, with results for recognizing both the quadrilateral and the arbitrary object. Once again, the error percentages obtained are less than or close to 1%.


TABLE 3

DUALTEST Experiments: Absolute and Percentage Error Results

                    DUALTEST
QUAD    Absolute    0.822
        PEZ (%)     0.967
ARB     Absolute    1.158
        PEZ (%)     1.362


This indicates that the PIHT algorithm and associated framework are capable of recognizing a specific object in an image, even when other shapes are present. This innovation can be used by companies providing remote sensing data and capabilities to identify areas or locations on the Earth's surface. The innovation provides a straightforward methodology that uses a single template of the desired object for recognition; the object can be located regardless of its scale, rotation, or translation difference in the image.

Although preferred embodiments of the present invention have been disclosed in detail herein, it will be understood that various substitutions and modifications may be made to the disclosed embodiment described herein without departing from the scope and spirit of the present invention as recited in the appended claims.

I claim: 

1. A method of recognizing a pattern in an image com- 
prising the steps of: 

(a) receiving data characterizing the pattern in pattern 
coordinate space; 

(b) selecting feature points for the pattern; 

(d) transforming the pattern from pattern coordinate space to a parameter space by creating a parameter table characterizing the pattern wherein pairs of feature points of the pattern are parameterized as a set of vectors with respect to at least two reference points;

(e) receiving data representing the image;

(f) extracting points of interest from the image data utilizing a low level vision process;

(g) initializing a parameter space accumulator comprising an array of cells;

(h) selecting pairs of extracted points parallel to each other in respect of their gradient angle and parameterizing the pairs of extracted points;

(i) comparing the values computed for pairs of extracted points with the parameter table and incrementing the cells of the parameter space accumulator corresponding to matching parameter values; and

(j) processing relative peaks in the accumulator array to determine a match with the desired template object.


2. The method of claim 1 wherein a distance between the 
pairs of feature points is calculated and is a parameter 
included in the parameter table. 

3. The method of claim 1 wherein the pairs of feature 
points are parallel to each other in terms of their gradient 
angle. 

4. The method of claim 1 wherein the parameter table 
characterizing the pattern is invariant to rotation, scale and 
translation. 

5. The method of claim 3 wherein a parallel gradient identifier angle is computed for a pair of feature points as follows:

a line connecting the pair of feature points has a direction angle $\sigma$,

the gradient angle at each of the feature points is assigned the value $\psi$, and

the parallel gradient identifier angle $\phi$ is computed by subtracting the direction angle from the gradient angle so that $\phi = \psi - \sigma$.

6. The method of claim 5 wherein scale-invariance is 
included in the parameter table by normalizing the radial 
distance portion of the vector connecting a midpoint of the 
line between the pair of feature points to one of the at least 
two reference points, and including the normalized radial 
distance in the parameter table. 

7. The method of claim 5 wherein the distance between 
the pair of feature points is computed and included in the 
parameter table. 

8. The method of claim 5 wherein the direction angle $\sigma$ between the pair of feature points is a parameter included in the parameter table.

9. The method of claim 5 wherein the parallel gradient identifier angle $\phi$ of the pair of feature points serves as an index for the parameter table.

10. The method of claim 1 wherein a match determined 
with the pattern is processed as a hypothesized match, and 
the parameter table entries corresponding to the relative 
peaks in the accumulator array for the hypothesized match 
are inversely transformed from parameter space to depict 
test points in pattern space. 

11. The method of claim 10 wherein the test points are 
superimposed upon the image. 

12. The method of claim 11 wherein a distance transform 
is applied to determine a level-of-match between the test 
points and points of interest in the image. 

13. The method of claim 12 wherein a matching metric is 
computed to quantitatively enumerate a level of match 
between the test points inversely transformed and the 
hypothesized pattern in the image. 

14. The method of claim 13 wherein the root mean square 
average is computed to quantitatively enumerate the level of 
match. 

15. The method of claim 1 wherein the pattern has an 
arbitrary shape. 

16. The method of claim 1 wherein the parameter table is 
the only stored representation of the pattern in parameter 
space utilized in the comparison to the values computed for 
pairs of extracted points. 

17. The method of claim 1 wherein a parallel gradient identifier angle is computed for a pair of extracted points as follows:

a line connecting the pair of extracted points has a direction angle $\sigma$,

the gradient angle at each of the extracted points is assigned the value $\psi$, and

the parallel gradient identifier angle $\phi$ is computed by subtracting the direction angle from the gradient angle so that $\phi = \psi - \sigma$.

18. The method of claim 9 wherein a parallel gradient identifier angle is computed for a pair of extracted points and is used to look up parameters corresponding to said parallel gradient identifier angle in the parameter table.



